THE PROBLEM

Many  studies  have  suggested  the  possibility  of  predicting  operational 
performance  in  fleet  aviation  environments.  Research  is  currently  being 
conducted  to  develop  relevant  predictor  tests,  the  results  of  which  might 
aid in the making of decisions concerning aircrew selection, training pipeline
assignment, and posttraining aircraft assignment. The current approach
is to use an automated performance-based test battery involving cognitive
and  psychomotor  functioning  to  predict  the  operational  performance  of 
fighter  pilots. 

FINDINGS 

Two groups of pilots who were completing fleet replacement squadron (FRS)
training for the F/A-18 were tested on this battery. The older, more
experienced pilot group had higher FRS grades than did the other group, but
test performance did not differ significantly between the two groups.
The significant correlations between test measures and FRS grades were
too few, too weak, and too inconsistently patterned to serve as
reliable predictors. This may be due to the homogeneous nature of each
subject group in terms of piloting skills and abilities.

RECOMMENDATIONS 


We recommend that research of this type using this test battery be
continued with some changes. Differences in test performance
among both similar and different pilot-type groups should be investigated as
thoroughly as possible. Replication is crucial if this testing methodology
is to be considered for purposes of selection and assignment in naval
aviation. Changes in test structure that would increase testing efficiency,
or apparatus that would increase or at least stabilize test subject effort,
should be investigated. Also, research in which subjects are tested before
flight training and then followed throughout their aviation career is
needed. Such long-term studies would allow a more accurate assessment of
the predictive ability of these tests in choosing the best candidate for a
particular aviation platform or community.
INTRODUCTION


Research  is  being  performed  at  the  Naval  Aerospace  Medical  Research 
Laboratory (NAMRL) to predict fleet aviator inflight performance using
cognitive  and  psychomotor  tests.  The  goal  is  to  develop  relevant  predictor 
tests  that  will  reliably  relate  to  simulated  and  actual  flight  performance. 
Results of this effort may aid in decisions regarding initial pilot
selection,  training  pipeline  assignment,  and  posttraining  aircraft  assignment. 

A  number  of  Navy  research  efforts  have  been  marginally  successful  in 
predicting  various  aspects  of  operational  aviator  performance.  For  example, 
peer  ratings  from  Navy  preflight  training  were  useful  in  identifying  both 
successful and unsuccessful aviators in combat in Vietnam (1). A study (2) of
F-4 Replacement Air Group (RAG) training during the midsixties resulted in a
prediction equation that could possibly have reduced RAG attrition from
13.3% to 8.3%. When F-4 Air Combat Maneuvering (ACM) on the Tactical
Aircrew Combat Training System range in the late seventies was evaluated,
the authors found that three criterion measures (angle-of-tail, closing
velocity, and indicated air speed) were significantly related to ACM
performance (3). A combination of psychological tests and actual flight
performance measures successfully predicted F-4 carrier landing performance
(4). Others found that a relatively small set of RAG criterion scores
reliably predicted final overall RAG grade (5). The two most promising
scores (carrier qualification power/nose control, and offensive ACM)
accounted for 73% of the variance in the final overall RAG grade. In two
subsequent studies (6,7), the authors reported that a regression equation
based on the performance of an East coast F-4 RAG reliably predicted
performance of a West coast F-4 RAG, and an overall experience measure
combined with seven undergraduate training grades reliably predicted the
overall RAG grade. More recently, automated dichotic listening and
psychomotor (cursor tracking) tests predicted some elements of the ACM
performance of Marine F-4 pilots on an instrumented training range (8).

These  studies  suggest  the  possibility  of  predicting  operational 
performance  in  fleet  aviation  environments.  Our  approach  is  to  use 
automated performance-based tests of cognitive and psychomotor functioning
to predict aviator performance in operational settings. This report
documents an attempt to use an automated battery of performance-based tests
to predict the Fleet Replacement Squadron (FRS) flight performance of
aviators  assigned  to  Squadron  VFA-106  at  Cecil  Field,  Florida,  who  were 
transitioning  to  the  F/A-18. 


METHODS 


SUBJECTS 

Sixty-seven jet fighter pilots completed an automated
cognitive/psychomotor test battery. Thirty-seven subjects were Category I
pilots who averaged 27.17 years of age (SD = 1.56) with an average of 446.59
previous flight hours (SD = 296.03). Many of these subjects had been
assigned to the F/A-18 directly after completing advanced undergraduate
flight training. Thirty subjects were Category II pilots who averaged 30.93
years of age (SD = 3.97) with an average of 1554.03 previous flight hours
(SD = 932.83). Many of them were transitioning to the F/A-18 from other
operational fleet aircraft, typically A-7s or F-4s. Of the 67 subjects
tested, 64 completed the fleet replacement squadron (FRS) program of
training while three Category I pilots failed.

APPARATUS  AND  PROCEDURES 

Table 1 lists the various tests given, the sequence of their
occurrence, and the time required to administer each. The entire series was
automated using an Apple IIe microcomputer, an Amdek Color I Plus monitor
(CRT), and an Apple IIe numeric keypad. All test instructions were
presented on the CRT to each subject before the start of each test.


TABLE 1. Sequence, Description, and Operating Times of Automated Tests.

Presentation                                                    Test times (min)
order         Description                                       individual / cumulative

  1.          Single psychomotor task (PMT), stick only (S)          10 /  10
  2.          Single dichotic listening task (DLT)                   23 /  33
  3.          First multitask (1,2 combined)                         05 /  38
  4.          Single (PMT), stick & rudder (S&R)                     13 /  51
  5.          Second multitask (4,2 combined)                        05 /  56
  6.          Third multitask (4,2 combined)                         05 /  61
  7.          Single PMT; stick, rudder, & throttle (S&R&T)          07 /  68
  8.          Second single PMT (like 7, S&R&T)                      04 /  72
  9.          Fourth multitask (8,2 combined)                        06 /  78
 10.          One dimensional compensatory tracking (ODCT)           10 /  88
 11.          Absolute difference computation (ADC)                  10 /  98
 12.          Fifth multitask, ODCT & ADC (10,11 combined)           10 / 108


Psychomotor Task (PMT)

The psychomotor tracking task required subjects to maintain first one,
then two, and finally three randomly displaced cursors on fixed targets on
the CRT by manipulating joysticks and foot pedals. Subjects manipulated one
Measurement Systems, Inc., joystick (stick or S), located at the front seat
edge, with their right hand to control a cursor that was free to move
throughout a rectangle covering approximately two-thirds of the CRT screen.
The target position of this cursor was indicated by crosshairs bisecting
this rectangular area, with the center point being slightly to the right and
above the center of the screen. The stick controlled this cursor in a
backwards (reversed) manner, that is, moving the stick to the right moved
the cursor to the left while pulling the stick toward the subject moved the
cursor up, et cetera. Locally produced rudder pedals (rudder or R),
patterned after those of a Systems Research Laboratories, Inc., psychomotor
test device and located directly below the table supporting the
microcomputer and related equipment, were used to control a cursor that moved
horizontally across the bottom of the screen. Pushing the left pedal moved
this cursor to the right while pushing the right pedal moved it to the left.
Another Measurement Systems joystick (throttle or T), located on the left
seat edge, was manipulated by the subject's left hand to move a cursor
vertically on the left side of the screen. Pulling this throttle back moved
this cursor down while pushing it forward moved it up. During initial
testing at VFA-106, the development of the throttle portion of the PMT was
incomplete. Because of this, not all subjects were tested on the
stick-rudder-and-throttle (S&R&T) task.

Psychomotor  task  tests  1,  4,  and  7  (see  Table  1)  were  each  preceded  by 
a  3-min  practice  period.  The  6-min  testing  period  of  test  1  and  the  9-min 
testing  period  of  test  4  were  each  divided  into  3-min  testing  sessions 
separated  by  20-s  rest  periods.  Tests  7  and  8  had  a  single  3-min  testing 
period  each.  Psychomotor  task  scores  were  the  accumulated  total  of  absolute 
errors  from  an  ideal  target  position  in  pixels.  For  each  time  sampling  of 
cursor  position,  absolute  pixel  errors  were  assessed  along  each  dimension 
separately.  The  final  error  score  was  the  sum  of  all  the  samplings  made 
across all the dimensions represented in that particular test. This error
score was for the total time of that test, except for tests 1 and 4 where
only  the  first  3-min  session  and  the  first  two  3-min  sessions,  respectively, 
were  analyzed.  This  error  score  total  was  then  divided  by  the  number  of 
minutes  of  each  test  analyzed  to  generate  a  standard  rate  of  pixel  error  per 
1  min  of  test  time.  The  scores  of  tests  5  and  6  and  tests  7  and  8  were 
averaged  for  each  subject.  All  of  these  PMT  error  scores  were  then 
transformed  by  using  logarithms  to  base  10  in  order  to  reduce  skewness  and 
to  compensate  for  extreme  outliers,  thus  reducing  the  complexity  of  data 
analysis  while  retaining  all  the  data  points  available. 
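
The scoring rule just described can be sketched in a few lines of Python. This is an illustration only, not the original test software: the function name `pmt_score`, the sample format (one tuple of signed per-axis pixel errors per time sample), and the example values are all assumptions made for the sketch.

```python
import math

def pmt_score(samples, minutes_analyzed):
    """Return the log10 pixel-error rate for one PMT test.

    samples: iterable of tuples, one per time sample, holding the signed
             pixel error on each controlled axis (hypothetical format).
    minutes_analyzed: minutes of testing covered by the samples.
    """
    # Absolute error is assessed on each dimension separately, then summed
    # across all dimensions and all samples.
    total_error = sum(abs(err) for sample in samples for err in sample)
    # Standardize to pixel error per 1 min of test time.
    error_per_min = total_error / minutes_analyzed
    # Base-10 log transform to reduce skewness and the pull of outliers.
    # (A total error of exactly zero would need special handling.)
    return math.log10(error_per_min)

# Hypothetical stick-only data: 100 samples of (x error, y error) over 3 min.
example_samples = [(10, -20)] * 100
example_score = pmt_score(example_samples, minutes_analyzed=3)
```

Under this reading, a unitask PMT mean near 3 on the log scale corresponds to roughly 1000 pixels of accumulated error per minute.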

Dichotic  Listening  Task  (DLT) 

The  DLT  was  patterned  after  a  test  described  by  Gopher  and  Kahneman 
(9),  subsequently  modified  by  Griffin  and  Mosko  (10),  and  then  automated  at 
NAMRL. The DLT is an auditorily presented series of letter-digit string
sets. Two Jameco JE 520-AP Voice Synthesizers were used to present these
letter-digit strings at the rate of 0.7 s per item over binaural headphones
to each subject at a listening level of 72 dB Leq (re: 20 µPa). Subjects were
instructed  on  each  trial  as  to  which  ear  to  attend  to,  first  for  a  series  of 
16  pairs  of  letters  and/or  numbers  (Part  I)  and  then  again  for  a  series  of  6 
more pairs (Part II). A visual example of a typical trial is given in Table
2.  Subjects  were  to  indicate  the  digits  (0-9)  presented  to  the  designated 
ear  in  the  order  of  their  occurrence.  Subjects  responded  with  the  left  hand 
using  a  separate  keypad  placed  immediately  in  front  and  slightly  left  of 
center.  Responses  could  be  made  while  the  items  were  being  presented  or 
during  an  interval  of  1.4  s  after  the  presentation  of  the  last  letter  and/or 
number  pair.  Five  correct  responses  were  possible  on  Part  I  and  four  on 
Part  II  of  each  trial,  which  together  required  21  s  to  complete.  The  test 
was preceded by six auditorily presented practice trials that incorporated
immediate performance feedback visually indicating the letters and digits
presented and the subjects' keypad responses. Subjects also completed three
multiple-choice questions before the start of this test to make certain that
they  understood  the  concept  of  the  DLT. 

The  DLT  performance  measure  was  the  number  of  incorrect  responses  made 
over  24  trials  in  which  a  total  of  216  correct  responses  were  possible.  The 
number  of  correct  responses  made  was  divided  by  2  and  then  subtracted  from 
109  (half  the  total  possible  correct  plus  1)  in  order  to  make  it  directly 
comparable  to  the  multitask  DLT  measures.  This  new  adjusted  error  score  was 
then  transformed  by  using  logarithms  to  base  10  to  adjust  for  both  skewness 
and  extreme  outliers  as  was  done  with  the  PMT  results. 
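
As a worked illustration of the adjustment described above, the following Python sketch computes the adjusted unitask DLT error score and its log transform. The function names are hypothetical, and the multitask version (108 possible correct responses) is deliberately left out because the exact multitask formula is not spelled out here.

```python
import math

TRIALS = 24
CORRECT_PER_TRIAL = 5 + 4          # five on Part I, four on Part II
TOTAL_POSSIBLE = TRIALS * CORRECT_PER_TRIAL   # 216

def dlt_adjusted_error(num_correct):
    """Adjusted unitask DLT error score, before the log transform."""
    # Halve the number correct, then subtract it from half the total
    # possible correct plus 1 (that is, from 109).
    return (TOTAL_POSSIBLE / 2 + 1) - num_correct / 2

def dlt_score(num_correct):
    """Log10 of the adjusted error score."""
    return math.log10(dlt_adjusted_error(num_correct))
```

With this formula a perfect performance (216 correct) gives an adjusted error of 1 and a log score of 0, while no correct responses gives an adjusted error of 109.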


TABLE 2. Visual Example of a DLT Trial.


PART I    Left Ear         R 8 N S M Y 2 G B 7 F L 6 R L 5
          Vocal Channel    "Right" ('attend' command)
          Right Ear        Y L 3 S R 4 F Z 9 X F 0 F N 1 L

PART II   Left Ear         B F 4 3 7 9
          Vocal Channel    "Left" ('attend' command)
          Right Ear        G L 1 5 6 2
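
The example trial in Table 2 can be used to illustrate how target digits are identified. The Python sketch below pulls the digits out of the attended ear's stream in order of occurrence. The function names are hypothetical, and the position-by-position scoring rule in `count_correct` is an assumption, since the exact rule used to score response order is not given.

```python
def target_digits(stream):
    """Digits in one ear's stream, in order of occurrence."""
    return [ch for ch in stream if ch.isdigit()]

def count_correct(stream, responses):
    """Count responses that match the target digit in the same serial
    position (assumed scoring rule, for illustration only)."""
    targets = target_digits(stream)
    return sum(1 for t, r in zip(targets, responses) if t == r)

# Attended streams from Table 2: right ear in Part I, left ear in Part II.
part1_right = "YL3SR4FZ9XF0FN1L"
part2_left = "BF4379"

part1_targets = target_digits(part1_right)   # five target digits
part2_targets = target_digits(part2_left)    # four target digits
```

The attended streams yield five targets in Part I and four in Part II, matching the number of correct responses possible per trial.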


Multitask  PMT/DLT 

In all of the multitask conditions, subjects performed both the DLT and
PMT simultaneously (a 12-trial DLT and a 4.5-min PMT). During the first
multitask condition (test 3), subjects performed the DLT and the stick-only
PMT (S). During the next two multitask conditions (tests 5 & 6), subjects
performed the DLT and the stick-and-rudder PMT (S&R) using their right hand
and both feet to control the central joystick and rudder pedals and their
left hand to make keypad responses to the DLT input. During the final
multitask condition (test 9), subjects performed the DLT and the
stick-rudder-and-throttle PMT (S&R&T). In this most elaborate combination,
subjects  used  their  right  hand  and  both  feet  to  control  the  central  joystick 
and  rudder  pedals  as  before  but,  in  addition,  used  their  left  hand  to 
control  the  throttle  joystick  and  voiced  their  DLT  responses  using  a 
microphone  attached  to  the  headphones.  These  vocal  responses  were  tape- 
recorded  for  subsequent  analysis  and  hand  scoring.  Before  the  start  of  the 
various  multitask  combinations,  subjects  were  instructed  to  perform  each 
task  equally  well. 

Performance measures for the PMT and DLT in these multitask conditions
were identical to those of the single tasks alone except for a different
length of PMT testing and the presentation of 12 DLT trials in which a total
of 108 correct responses were possible. The DLT began 15 s after the PMT
and ended just before the PMT, with PMT errors being recorded for the final
4 min of that test. Figure 1 shows a subject performing the multitask
PMT/DLT on the automated test apparatus.

One Dimensional Compensatory Tracking (ODCT)

The  ODCT  in  general  has  been  described  (11)  as  follows.  The  task 
requires subjects to center a square-shaped cursor inside of an elongated
rectangle  by  making,  with  their  right  hand,  left  and  right  movements  of  a 
joystick  centered  on  the  front  seat  edge.  The  cursor  is  driven  by  a  forcing 
function,  which  increases  centering  effort  with  distance  from  center. 

During  this  phase  of  the  task,  subjects  received  three  2-min  trials,  with 
each  trial  separated  by  a  30-s  rest  period.  The  test  measure  for  the  ODCT 
was total pixel deviation error averaged over the three single-task trials.

Figure 1. Automated psychomotor/dichotic listening task.


Absolute  Difference  Computation  (ADC) 

Randomly  selected  digits  between  1  and  9  were  presented  inside  a  small 
square  in  the  middle  of  the  CRT  to  subjects  who  then  determined  the  absolute 
difference  between  the  digit  currently  on  display  on  the  CRT  and  the  last 
digit  displayed  previously.  The  subjects  then  pressed  the  corresponding 
digit-key  on  the  keypad  with  their  left  hand  as  quickly  as  possible 
resulting  in  the  display  of  another  number  for  computation.  Identical 
digits  were  not  allowed  to  repeat.  Only  the  digit  responses  1,  2,  3,  and  4 
were  possible.  Subjects  received  three  2-min  trials,  with  each  trial 
separated by 20 s of rest. Performance measures for the ADC were the number
of correct responses made and the average reaction time of these correct
responses, both averaged over the three ADC trials.
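
The ADC stimulus constraints can be sketched as follows. This Python example is one possible way to generate a digit sequence in which consecutive digits never repeat and the correct answer is always 1, 2, 3, or 4. How the original software enforced these constraints is not described, so the generation method, the function names, and the seed are assumptions.

```python
import random

def next_digit(previous, rng):
    """Pick a digit 1-9 whose absolute difference from `previous` is 1-4,
    which also rules out an immediate repeat."""
    candidates = [d for d in range(1, 10) if 1 <= abs(d - previous) <= 4]
    return rng.choice(candidates)

def adc_sequence(length, seed=0):
    """Generate a hypothetical ADC stimulus sequence of `length` digits."""
    rng = random.Random(seed)
    digits = [rng.randint(1, 9)]
    while len(digits) < length:
        digits.append(next_digit(digits[-1], rng))
    return digits

def correct_responses(digits):
    """Correct key for each digit after the first: the absolute difference
    between the current digit and the one displayed just before it."""
    return [abs(cur - prev) for prev, cur in zip(digits, digits[1:])]

stimuli = adc_sequence(20, seed=1)
answers = correct_responses(stimuli)
```

Every entry in `answers` falls in the set {1, 2, 3, 4}, so only those four keys are ever needed.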

Dual Task ODCT/ADC

During  this  phase  of  testing,  subjects  performed  both  the  ODCT  and  ADC 
concurrently.  The  digits  for  the  difference  task  were  centered  just  above 
the  tracking  task.  The  subjects  controlled  the  tracking  task  joystick  with 
their  right  hand  and  made  keypad  responses  to  the  difference  task  with  their 
left  hand.  Subjects  were  instructed  to  perform  each  task  equally  well. 
Subjects received three 2-min trials, with each trial separated by 30 s of
rest. Test measures for the dual task ODCT/ADC were the same as those for
the single tasks. Figure 2 shows a CRT screen display from the dual task
ODCT/ADC.

Figure 2. The CRT display for dual-task ODCT/ADC.


Operational  Performance  Criteria 

Aviators  undergoing  FRS  training  at  VFA-106  receive  a  series  of  grades 
comparing  their  performance  to  that  of  others  undergoing  this  training. 
Specifically, they are graded on their performance in the Transition (TRAN),
Basic Fighter Maneuvering (BFM), Gunnery (GUN), Visual Intercept (VID),
Fighter Weapons Training (FWT), Navigation (NAV), Light Attack (LAT), Strike
(STK), and Carrier Qualification (CQ) portions of the VFA-106 training
program.  The  overall  grade  (OAG)  is  an  equally  weighted  composite  of  all  of 
these  individual  FRS  grades. 


RESULTS 


AVIATOR  FRS  PERFORMANCE 

Using one-way analysis of variance (ANOVA), we found significant
differences between Category I and II aviators on most of the FRS grades
(Table 3). For every FRS grade, the average score of Category II pilots was
higher than that of Category I pilots. At least two explanations are
plausible. First, Category II pilots may have performed better on these
measures because they had more accumulated flight hours in and out of jet
aircraft. A second reason could be that some element of the scoring
procedure had biased these scores in favor of the Category II pilots. This
could be due to the fact that the Category II pilots were scored by their
peer group while the Category I pilots were not.


TABLE 3. Descriptive Statistics for FRS Grades: Category I and II
Aviators.


FRS         Category I            Category II
grade     Mean    SD     n      Mean    SD     n        F        p

TRAN      3.08   0.03   35      3.10   0.03   30      11.46    .0016
BFM       3.12   0.06   35      3.15   0.05   30       5.46    .0214
GUN       3.08   0.04   34      3.10   0.04   29       1.37    >.05
VID       3.10   0.04   35      3.12   0.07   30       1.74    >.05
FWT       3.11   0.04   34      3.14   0.04   30      10.63    .0022
NAV       3.09   0.06   35      3.12   0.05   29       5.66    .0194
LAT       3.08   0.04   35      3.12   0.03   29      13.55    .0008
STK       3.07   0.03   35      3.10   0.02   29      14.71    .0005
CQ        2.94   0.12   34      3.07   0.14   27      14.65    .0006
OAG       3.08   0.02   34      3.11   0.03   29      27.53    .00003
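
For readers who want to check the group comparisons in Table 3, a one-way ANOVA F ratio can be approximated from the summary statistics alone. The Python sketch below is an illustration under that assumption, not the analysis actually run for this report, and the function name is hypothetical. Because the tabled means and SDs are rounded to two decimals, recomputed values only approximate the reported F ratios, and for grades with very small SDs the rounding error can be large.

```python
def anova_f_from_summary(groups):
    """One-way ANOVA F ratio from (mean, sd, n) tuples, one per group.

    Returns (F, df_between, df_within).
    """
    total_n = sum(n for _, _, n in groups)
    k = len(groups)
    # Grand mean is the n-weighted average of the group means.
    grand_mean = sum(m * n for m, _, n in groups) / total_n
    # Between-groups and within-groups sums of squares.
    ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups)
    ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups)
    df_between = k - 1
    df_within = total_n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# CQ row of Table 3: (mean, SD, n) for Category I and Category II.
f_cq, df1, df2 = anova_f_from_summary([(2.94, 0.12, 34), (3.07, 0.14, 27)])
```

For the CQ row, the recomputed F (on 1 and 59 degrees of freedom) lands close to, but not exactly on, the reported 14.65, which is consistent with rounding in the tabled values.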

AVIATOR  TEST  BATTERY  PERFORMANCE 

In this study, two identical testing stations were utilized, and
subjects were randomly assigned to one of these two. These two stations did
not differ significantly in terms of pilot test performance. Table 4
presents descriptive statistics on the performance of both Category I and
Category II FRS pilots on the psychomotor and cognitive tests as well as
flight hours and age. Not all test scores were obtained for all subjects
due to scheduling problems and apparatus malfunctions. One-way ANOVAs
showed no significant differences between Category I and II aviators on any
of these tests. They did differ significantly, however, in age (F(1, 65) =
46.52, p < .00001) and flight hours (F(1, 63) = 26.63, p < .00004). As noted
earlier, the two categories of pilots were identified and treated
differently at the FRS. Because of this and the differences found in FRS
grades, we analyzed the data for each pilot category group separately.

For  both  pilot  groups,  the  mean  number  of  errors  made  on  the  PMT, 
regardless  of  motor  complexity  level,  decreased  when  the  DLT  was  added. 
Two-tailed  t  tests  for  dependent  samples  showed  this  difference  to  be 
significant  for  all  conditions  (all  t  values  >  6.24,  all  p  values  <  .01)  and 
would  indicate  that  the  subjects  performed  better  when  the  PMT  and  DLT  were 
combined.  A  more  parsimonious  explanation  involves  the  fact  that,  as  the 
DLT  was  brought  on  line  with  the  PMT,  the  particular  microcomputer  used 
could  not  maintain  the  level  of  cursor  positioning  difficulty  attained 
previously  due  to  processor  overload.  This  overloading  also  produced  a 
possible  reduction  in  error  sampling  rate  as  test  complexity  increased.  An 
apparent  decrease  in  testing  efficiency  does  not  invalidate  the  usefulness 
of  these  results  or  methodology  in  predicting  flight  performance,  but  it 
does  call  for  a  possible  change/upgrade  in  computer  equipment.  In  this 
regard,  using  Friedman  two-way  ANOVAs  (12),  we  found  that  for  both  category 
groups,  subjects  made  significantly  more  errors  as  PMT  complexity  increased 
during both the unitask and multitask conditions (all chi-square values >
34.10, all df = 2, all p values < .01).

TABLE 4. Descriptive Statistics of Tests: Category I and II Aviators.


                                   Category I              Category II
Test measure                     Mean     SD     n       Mean     SD     n

Unitask DLT                      0.70    0.25   35       0.71    0.21   30
Multitask DLT w/(S)              0.64    0.39   36       0.67    0.35   30
Multitask DLT w/(S&R)            0.76    0.29   36       0.72    0.28   30
Multitask DLT w/(S&R&T)          0.89    0.39   18       0.84    0.22   18
Unitask PMT (S)                  3.05    0.11   35       3.01    0.16   29
Multitask PMT (S) w/DLT          2.74    0.14   37       2.73    0.17   30
Unitask PMT (S&R)                3.38    0.10   35       3.40    0.15   29
Multitask PMT (S&R) w/DLT        3.15    0.17   37       3.13    0.17   30
Unitask PMT (S&R&T)              3.55    0.08   18       3.57    0.20   19
Multitask PMT (S&R&T) w/DLT      3.38    0.13   18       3.35    0.15   18
Single tracking (ODCT)          22.61    5.39   29      24.21    9.58   12
Single abs diff. (ADC)          55.78   13.93   29      59.03   11.32   12
Single abs diff. (ADC) RT        2.35    0.44   29       2.20    0.35   12
Dual tracking (ODCT)            31.81    9.17   29      38.95   14.85   12
Dual abs diff. (ADC)            60.28   15.12   29      62.53    7.69   12
Dual abs diff. (ADC) RT          2.24    0.58   29       2.05    0.22   12

TEST  BATTERY/FRS  PERFORMANCE  CORRELATIONS 

Individual Pearson product-moment correlations were performed among the
various test battery measures and the FRS grades. Of these 160 correlations,
only 5 (3%) were significant at the .05 level for the Category I aviators while
only 12 (8%) were significant for the Category II aviators. Generally, this
would be expected merely by chance, since 160 tests at the .05 level yield
about 8 significant results by chance alone. Also, the arrangement of significant
correlations was unique for each pilot category and, for the most part, did
not follow any logically obvious pattern. This became evident when
attempting to explain both the dissimilar significant correlations found
between very similar tests, some of which were in the direction opposite to
that expected, and the lack of similar significant correlations among FRS
grades that appeared to be tapping into similar piloting skills. As a check
of true significance, we performed canonical correlation analysis (13), a
generalization of multiple regression analysis for any number of dependent
variables, on both pilot categories. Given the nature of this analysis,
only test battery measures with an n > 20 were included, and the OAG score
was excluded due to its redundant composite nature. Neither the Category I
(Canonical R = .98, Chi-square = 134.26, df = 117, p = .131) nor the
Category II (Canonical R = .86, Chi-square = 72.19, df = 63, p = .200)
results were significant. Also, no significant correlations were found
between either age or flight hours and any of the FRS grades for either
pilot category.
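
The chance argument above can be made concrete with a short calculation. The Python sketch below computes the expected number of significant correlations among 160 tests at the .05 level and the binomial probability of seeing at least a given number. It assumes the 160 tests are independent, which is a simplification, because correlations that share test measures or FRS grades are themselves correlated. The function names are hypothetical.

```python
from math import comb

def expected_false_positives(n_tests, alpha):
    """Expected count of significant results if every null hypothesis holds."""
    return n_tests * alpha

def prob_at_least(k, n_tests, alpha):
    """P(X >= k) for X ~ Binomial(n_tests, alpha), assuming independent tests."""
    return sum(
        comb(n_tests, i) * alpha ** i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )

expected = expected_false_positives(160, 0.05)   # about 8 by chance alone
p_cat1 = prob_at_least(5, 160, 0.05)    # Category I: 5 significant of 160
p_cat2 = prob_at_least(12, 160, 0.05)   # Category II: 12 significant of 160
```

Under the independence assumption, 5 significant results is fewer than the chance expectation and 12 is somewhat more, which is why the canonical correlation analysis serves as a useful overall check.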

CONCLUSIONS  AND  RECOMMENDATIONS 

The  results  of  this  study  indicate  virtually  no  significant 
relationships  between  performance  on  this  test  battery  and  FRS  performance 
for  this  particular  type  of  aviator.  Specifically,  those  cognitive  and 
psychomotor abilities assumed to be measured by this test battery were not
significant factors in FRS performance. Quite possibly, for any group of
experienced pilots, results from such a battery would not correlate
significantly with such operational performance measures. This would most
likely  be  due  to  the  fact  that  the  skill  and  ability  levels  found  in  such  a 
subject  group  would  have  already  been  significantly  equalized  across 
subjects  as  a  consequence  of  common  selection,  training,  and  flight 
experiences.  If  so,  whatever  test  performance  variance  was  found  within 
this  subject  group  would  be  mostly  due  to  factors  different  from  those 
producing  the  variance  seen  in  the  FRS  grades. 

We recommend that research of this type using this test battery be
continued with some changes. Differences in test performance
among both similar and different pilot-type groups should be investigated as
thoroughly as possible. Replication is crucial if this testing methodology
is to be considered for purposes of selection and assignment in naval
aviation. Changes in test structure that would increase testing efficiency,
or apparatus that would increase or at least stabilize test subject effort,
should be investigated. Also, research in which subjects are tested before
flight training and then followed throughout their aviation career is
needed. Such long-term studies would allow a more accurate assessment of
the predictive ability of these tests in choosing the best candidate for a
particular aviation platform or community.